Many region-scale analyses in archaeology begin with a simple question: How do site locations relate to landscape attributes, such as elevation, soil type, or distance to water or other resources. Such a question is the foundation of basic geospatial exploratory data analysis, and answering it for a set of landscape attributes is the first step towards doing more interesting things, from interpreting settlement patterns in the past (Kowalewski, 2008), to the construction of sophisticated predictive models of site location (Graves McEwan, 2012; Kohler and Parker, 1986; Maschner and Stein, 1995), to guiding settlement decision frameworks in agent-based simulations (e.g., Axtell et al., 2002; Griffin and Stanish, 2007; Kohler and Varien, 2012).
Of course, archaeological site locations are often sensitive information, and it wouldn’t be prudent to provide them in resource like this. So instead of using actual site locations, we’ll use a point dataset for which we can make a resonable hypothesis concerning landscape attributes: radio tower locations from the US Federal Communications Commission. The radio tower data are somewhat difficult to work with, so I’ve distilled a snapshot of the database (acccessed on February 14, 2017), and posted it online for easy download. The hypothesis we’ll be testing is that radio towers are positioned in unusually high places on the landscape. This is similar to hypotheses we might make about archaeological site locations having to do with defensibility (e.g., Bocinsky, 2014; Martindale and Supernant, 2009; Sakaguchi et al., 2010) or intervisibility and signaling (e.g., Johnson, 2003; Van Dyke et al., 2016).
In this short tutorial, you will learn how to:
sp and rgdal packagesleaflet packageFedData packageThis tutorial is an R Markdown HTML document, meaning that all of the code to perform the calculations presented here was run when this web page was built—the paper was compiled. “Executable papers” such as this one are fantastic for presenting reproducible research in such a way that data, analysis, and interpretation are each given equal importance. Feel free to use this analysis as a template for your own work. All data and code for performing this analysis are available on Github at https://github.com/bocinsky/r_tutorials.
# Download the 1:500000 scale counties shapefile from the US Census
download.file("http://www2.census.gov/geo/tiger/GENZ2015/shp/cb_2015_us_county_500k.zip",
destfile = "./data/cb_2015_us_county_500k.zip")
# Unzip the file
unzip("./data/cb_2015_us_county_500k.zip",
exdir = "./data/counties")
# Load the shapefile
Whitman <- rgdal::readOGR(dsn = "./data/counties/", layer = "cb_2015_us_county_500k")
## OGR data source with driver: ESRI Shapefile
## Source: "./data/counties/", layer: "cb_2015_us_county_500k"
## with 3233 features
## It has 9 fields
## Integer64 fields read as strings: ALAND AWATER
# Select Whitman county
Whitman <- Whitman[Whitman$NAME == "Whitman",]
# Transform to geographic coordinates
Whitman %<>%
sp::spTransform("+proj=longlat")
# Get a polygon of the rectangular extent of Whitman county. This is our study area.
Whitman_extent <- Whitman %>%
FedData::polygon_from_extent()
# Load cell tower location data straight from the internet using a URL path
cell_towers <- readr::read_csv("https://raw.githubusercontent.com/bocinsky/r_tutorials/master/data/cell_towers.csv")
## Parsed with column specification:
## cols(
## `Unique System Identifier` = col_integer(),
## `Entity Name` = col_character(),
## `Height of Structure` = col_double(),
## Latitude = col_double(),
## Longitude = col_double()
## )
# Create a SpatialPointsDataFrame by adding coordinates
coordinates(cell_towers) <- ~Longitude+Latitude
# And set the projection information
proj4string(cell_towers) <- "+proj=longlat"
# Select cell towers in Whitman county extent
cell_towers <- cell_towers[!is.na(over(cell_towers,Whitman_extent)),]
# Create a quick plot of the locations
leaflet(width = "100%") %>% # This line initializes the leaflet map, and sets the width of the map at 100% of the window
addProviderTiles("OpenTopoMap", group = "Topo") %>% # This line adds the topographic map from Garmin
addProviderTiles("OpenStreetMap.BlackAndWhite", group = "OpenStreetMap") %>% # This line adds the OpenStreetMap tiles
addProviderTiles("Esri.WorldImagery", group = "Satellite") %>% # This line adds orthoimagery from ESRI
addProviderTiles("Stamen.TonerLines", # This line and the next adds roads and labels to the orthoimagery layer
group = "Satellite"
) %>%
addProviderTiles("Stamen.TonerLabels",
group = "Satellite"
) %>%
addPolygons(data = Whitman_extent, # This line adds the Whitman county extent polygon
label = "Whitman County",
fill = FALSE,
color = "black") %>%
addPolygons(data = Whitman, # This line adds the Whitman county polygon
label = "Whitman County",
fill = FALSE,
color = "red") %>%
addMarkers(data = cell_towers,
popup = ~htmlEscape(`Entity Name`)) %>% # This line adds cell tower locations
addLayersControl( # This line adds a controller for the background layers
baseGroups = c("Topo", "OpenStreetMap", "Satellite"),
options = layersControlOptions(collapsed = FALSE),
position = "topleft"
)
FedDataFedData is an R package that is designed to take much of the pain out of downloading and preparing data from federated geospatial databases. For an area of interest (AOI) that you specify, each function in FedData will download the requisite raw data, crop the data to your AOI, and mosaic the data, including merging any tabular data. Currently, FedData has functions to download and prepare these datasets:
In this analysis, we’ll be downloading the 1 arc-second elevation data from the NED under Whitman county, Washington. The FedData functions each require four basic parameters:
template defining your AOI, either supplied as a spatial object (spatial* or raster* or a spatial extent objectlabel) identifying your AOI, used for file namesraw.dir) defining where you want the raw data to be stored; this will be created if necessaryextraction.dir) defining where you want the extracted data for your AOI to be stored; this will also be created if necessaryHere, we’ll download the 1 arc-second NED with the get_ned() function from FedData, using the Whitman SpatialPolygonsDataFrame object that we created above as out template, and local relative paths for our raw.dir and extraction.dir. We’ll download and prepare the Whitman county NED, and then
Whitman_NED <- FedData::get_ned(template = Whitman,
label = "Whitman",
raw.dir = "./OUTPUT/RAW/NED/",
extraction.dir = "./OUTPUT/EXTRACTIONS/NED/")
# Print the class of the Whitman_NED object
class(Whitman_NED)
## [1] "RasterLayer"
## attr(,"package")
## [1] "raster"
# Print the Whitman_NED object
Whitman_NED
## class : RasterLayer
## dimensions : 3037, 4357, 13232209 (nrow, ncol, ncell)
## resolution : 0.0002777778, 0.0002777778 (x, y)
## extent : -118.2494, -117.0392, 46.41694, 47.26056 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
## data source : /Users/bocinsky/git/r_tutorials/OUTPUT/EXTRACTIONS/NED/Whitman_NED_1.tif
## names : Whitman_NED_1
# Plot the Whitman_NED object
Whitman_NED %>%
plot()
# Plot the Whitman county polygon over the elevation raster
Whitman %>%
plot(add = T)
As you can see, the NED elevation data was downloaded for the Whitman county area, and cropped to the rectangular extent of Whitman county.
# Extract cell tower elevations from the Whitman NED values
cell_towers$`Elevation (m)` <- Whitman_NED %>%
raster::extract(cell_towers)
## Warning in .local(x, y, ...): Transforming SpatialPoints to the CRS of the
## Raster
cell_towers_densities <- cell_towers$`Elevation (m)` %>%
density(from = 150,
to = 1250,
n = 1101) %>%
tidy() %>%
tibble::as_tibble() %>%
dplyr::mutate(y = y * 1101) %>%
dplyr::rename(Elevation = x,
Frequency = y)
# Load the NED elevations into memory for fast bootstrapping
Whitman_NED_values <- Whitman_NED %>%
values()
# Draw 999 random samples, and calculate their densities
Whitman_NED_densities <- foreach(n = 1:999, .combine = rbind) %do% {
Whitman_NED_values %>%
sample(length(cell_towers),
replace = FALSE) %>%
density(from = 150,
to = 1250,
n = 1101) %>%
tidy() %>%
tibble::as_tibble() %>%
dplyr::mutate(y = y * 1101)
} %>%
group_by(x) %>%
do({
quantile(.$y, probs = c(0.025, 0.5, 0.975)) %>%
t() %>%
tidy()
}) %>%
set_names(c("Elevation", "Lower CI", "Frequency", "Upper CI"))
g <- ggplot() +
geom_line(data = Whitman_NED_densities,
mapping = aes(x = Elevation,
y = Frequency)) +
geom_ribbon(data = Whitman_NED_densities,
mapping = aes(x = Elevation,
ymin = `Lower CI`,
ymax = `Upper CI`),
alpha = 0.3) +
geom_line(data = cell_towers_densities,
mapping = aes(x = Elevation,
y = Frequency),
color = "red")
ggplotly(g)
# Draw 999 random samples from the NED, and compute two-sample Wilcoxon tests (Mann-Whitney U tests)
Whitman_Cell_MWU <- foreach(n = 1:999, .combine = rbind) %do% {
Whitman_sample <- Whitman_NED_values %>%
sample(length(cell_towers),
replace = FALSE)
MWU <- wilcox.test(x = cell_towers$`Elevation (m)`,
y = Whitman_sample,
alternative = "greater",
exact = FALSE) %>%
tidy() %>%
tibble::as_tibble()
CLES <- outer(X = cell_towers$`Elevation (m)`,
Y = Whitman_sample,
FUN = "-")
CLES <- ifelse(CLES == 0, 0.5, CLES > 0) %>%
mean() %>%
tibble::tibble(CLES = .)
return(MWU %>%
bind_cols(CLES))
} %>%
dplyr::select(statistic, p.value, CLES)
Whitman_Cell_MWU <- foreach::foreach(prob = c(0.025,0.5,0.975), .combine = rbind) %do% {
Whitman_Cell_MWU %>%
dplyr::summarise_all(quantile, probs = prob)
} %>%
t() %>%
round(digits = 2) %>%
magrittr::set_colnames(c("Lower CI","Median","Upper CI")) %>%
magrittr::set_rownames(c("U statistic","p-value","CLES"))
Whitman_Cell_MWU
## Lower CI Median Upper CI
## U statistic 1518.90 1656.00 1799.00
## p-value 0.00 0.00 0.00
## CLES 0.75 0.82 0.89
The results of the bootstrapped Mann-Whitney U two-sample tests demonstrate that it is highly likely that the radio towers in Whitman county were placed on unusually high places on the landscape (median U statistic = 1656, median p-value = 0), and the median common language effect size of 0.82 suggests that 82 percent of randomly drawn radio towers will be at a higher elevation than another randomly drawn location in the county.
Axtell, R.L., Epstein, J.M., Dean, J.S., Gumerman, G.J., Swedlund, A.C., Harburger, J., Chakravarty, S., Hammond, R., Parker, J., Parker, M., 2002. Population growth and collapse in a multiagent model of the Kayenta Anasazi in Long House Valley. Proceedings of the National Academy of Sciences 99, 7275–7279.
Bocinsky, R.K., 2014. Extrinsic site defensibility and landscape-based archaeological inference: An example from the Northwest Coast. Journal of Anthropological Archaeology 35, 164–176.
Graves McEwan, D., 2012. Qualitative landscape theories and archaeological predictive modelling—A journey through no man’s land? Journal of Archaeological Method and Theory 19, 526–547.
Griffin, A.F., Stanish, C., 2007. An agent-based model of prehistoric settlement patterns and political consolidation in the Lake Titicaca Basin of Peru and Bolivia. Structure and Dynamics 2, 1–47 [online].
Johnson, C.D., 2003. Mesa Verde region towers: A view from above. Kiva 68, 323–340.
Kohler, T.A., Parker, S.C., 1986. Predictive models for archaeological resource location. Advances in Archaeological Method and Theory 9, 397–452.
Kohler, T.A., Varien, M.D. (Eds.), 2012. Emergence and collapse of early villages: Models of central mesa verde archaeology. University of California Press, Berkeley, California.
Kowalewski, S.A., 2008. Regional settlement pattern studies. Journal of Archaeological Research 16, 225–285.
Martindale, A., Supernant, K., 2009. Quantifying the defensiveness of defended sites on the Northwest Coast of North America. Journal of Anthropological Archaeology 28, 191–204.
Maschner, H.D., Stein, J.W., 1995. Multivariate approaches to site location on the Northwest Coast of North America. Antiquity 69, 61–73.
Sakaguchi, T., Morin, J., Dickie, R., 2010. Defensibility of large prehistoric sites in the Mid-Fraser region on the Canadian Plateau. Journal of Archaeological Science 37, 1171–1185.
Van Dyke, R.M., Bocinsky, R.K., Robinson, T., Windes, T.C., 2016. Great houses, shrines, and high places: Intervisibility in the Chacoan World. American Antiquity 81, 205–230.